Foundations of Machine Learning Frameworks  

CSCN8010 - Winter 2024  

Professor: Ran Feldesh  

Student: Arcadio de Paula Fernandez

1. Graph using Matplotlib

Titanic departing Southampton on 10 April 1912.

To create the graph below, we use Matplotlib, a plotting library for Python. The data come from the classic Titanic dataset, which records each passenger's age, sex, ticket class, fare, survival, and other attributes.

For more information about the Titanic you can access the following link.

The graph is a histogram of passenger ages: each bar shows how many passengers fall within a given age range.

In [3]:
# Importing several libraries for data visualization
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
In [4]:
# Load the Titanic dataset from seaborn's data repository
df = sns.load_dataset('titanic')

df.head()
Out[4]:
   survived  pclass     sex   age  sibsp  parch     fare  embarked  class    who  adult_male  deck  embark_town  alive  alone
0         0       3    male  22.0      1      0   7.2500         S  Third    man        True   NaN  Southampton     no  False
1         1       1  female  38.0      1      0  71.2833         C  First  woman       False     C    Cherbourg    yes  False
2         1       3  female  26.0      0      0   7.9250         S  Third  woman       False   NaN  Southampton    yes   True
3         1       1  female  35.0      1      0  53.1000         S  First  woman       False     C  Southampton    yes  False
4         0       3    male  35.0      0      0   8.0500         S  Third    man        True   NaN  Southampton     no   True
In [5]:
# Select the 'age' column and drop missing values with .dropna()
ages = df['age'].dropna()

# Set the number of bins for the histogram
n_bins = 30

# Creating the histogram plot
plt.hist(ages, bins=n_bins, edgecolor="white")


# Setting labels and title
plt.xlabel('Age')
plt.ylabel('Number of Passengers')
plt.title('Histogram of Passenger Ages')

# Showing the plot
plt.show()
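
To make explicit what the histogram call above counts, the following sketch bins a small, made-up list of ages with `np.histogram`. The `sample_ages` values and the choice of 4 bins are assumptions for illustration only, not data from the Titanic dataset.

```python
import numpy as np

# Hypothetical ages for illustration only (not taken from the Titanic dataset)
sample_ages = np.array([2, 18, 22, 25, 31, 35, 38, 40, 54, 80])

# Split the range [min, max] into 4 equal-width bins and count values per bin
counts, bin_edges = np.histogram(sample_ages, bins=4)

for left, right, count in zip(bin_edges[:-1], bin_edges[1:], counts):
    print(f"{left:5.1f} to {right:5.1f}: {count} passengers")
```

Each bar height in a histogram corresponds to one entry of `counts`. The Matplotlib cell above relies on the same kind of equal-width binning, with 30 bins instead of 4.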

2. Graph using Seaborn

Titanic at the docks of Southampton.

The graph below is another histogram of passenger ages, this time drawn with Seaborn, another Python data visualization library. It splits the passengers by sex and overlays a kernel density estimate (KDE) curve for each group.

In [6]:
import seaborn as sns
In [7]:
# Load the Titanic dataset from seaborn's data repository
df = sns.load_dataset('titanic')

df.head()
Out[7]:
   survived  pclass     sex   age  sibsp  parch     fare  embarked  class    who  adult_male  deck  embark_town  alive  alone
0         0       3    male  22.0      1      0   7.2500         S  Third    man        True   NaN  Southampton     no  False
1         1       1  female  38.0      1      0  71.2833         C  First  woman       False     C    Cherbourg    yes  False
2         1       3  female  26.0      0      0   7.9250         S  Third  woman       False   NaN  Southampton    yes   True
3         1       1  female  35.0      1      0  53.1000         S  First  woman       False     C  Southampton    yes  False
4         0       3    male  35.0      0      0   8.0500         S  Third    man        True   NaN  Southampton     no   True
In [8]:
# Histogram of ages split by sex, with a KDE curve overlaid for each group
plt.figure(figsize=(10, 6))
sns.histplot(data=df, x='age', kde=True, hue='sex')
plt.title('Age Distribution by Gender')
plt.show()
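
The `kde=True` option overlays a smoothed density curve on the bars. As a rough sketch of the idea (not seaborn's actual implementation), a Gaussian kernel density estimate averages a small bell curve centred on every observation. The helper `gaussian_kde_sketch`, the sample values, and the bandwidth of 5 years are assumptions chosen for illustration.

```python
import numpy as np

def gaussian_kde_sketch(samples, grid, bandwidth):
    """Average one Gaussian bump per sample, evaluated at each grid point."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardised distance between every grid point and every sample
    z = (grid[:, None] - samples[None, :]) / bandwidth
    bumps = np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2 * np.pi))
    return bumps.mean(axis=1)

# Hypothetical ages for illustration only (not taken from the Titanic dataset)
sample_ages = [22, 26, 35, 35, 38]
grid = np.linspace(-20, 100, 601)
density = gaussian_kde_sketch(sample_ages, grid, bandwidth=5.0)

# Location of the highest point of the smoothed curve
peak_age = grid[np.argmax(density)]
print(f"Estimated density peaks near age {peak_age:.1f}")
```

A larger bandwidth gives a smoother, flatter curve, and a smaller one follows the individual observations more closely. Seaborn chooses a bandwidth automatically unless told otherwise.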

3. Graph using Plotly

The sinking of the Titanic as depicted in Untergang der Titanic, a 1912 illustration by Willy Stöwer.

The graphs below use Plotly Express, another Python data visualization library. The pie chart shows the share of passengers who survived and who died; the scatter plot shows fare against age, colored by survival.

In [9]:
# Load the Titanic dataset from seaborn's data repository into titanic_data
titanic_data = sns.load_dataset('titanic')

# Viewing the first 5 rows
titanic_data.head()
Out[9]:
   survived  pclass     sex   age  sibsp  parch     fare  embarked  class    who  adult_male  deck  embark_town  alive  alone
0         0       3    male  22.0      1      0   7.2500         S  Third    man        True   NaN  Southampton     no  False
1         1       1  female  38.0      1      0  71.2833         C  First  woman       False     C    Cherbourg    yes  False
2         1       3  female  26.0      0      0   7.9250         S  Third  woman       False   NaN  Southampton    yes   True
3         1       1  female  35.0      1      0  53.1000         S  First  woman       False     C  Southampton    yes  False
4         0       3    male  35.0      0      0   8.0500         S  Third    man        True   NaN  Southampton     no   True
In [10]:
%pip install plotly
Requirement already satisfied: plotly in c:\users\arcad\appdata\local\programs\python\python311\lib\site-packages (5.14.1)
Requirement already satisfied: tenacity>=6.2.0 in c:\users\arcad\appdata\local\programs\python\python311\lib\site-packages (from plotly) (8.2.3)
Requirement already satisfied: packaging in c:\users\arcad\appdata\roaming\python\python311\site-packages (from plotly) (23.0)
Note: you may need to restart the kernel to use updated packages.
[notice] A new release of pip is available: 23.2.1 -> 23.3.2
[notice] To update, run: python.exe -m pip install --upgrade pip
In [11]:
import plotly.express as px
import plotly.offline as pyo

pyo.init_notebook_mode()
In [12]:
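# Map the 0/1 codes to labels so the color_discrete_map keys match the data
titanic_data['SurvivalLabel'] = titanic_data['survived'].map({0: 'Not Survived', 1: 'Survived'})
fig = px.pie(titanic_data, names='SurvivalLabel', color='SurvivalLabel', title='Passenger Survival', color_discrete_map={'Not Survived': 'red', 'Survived': 'green'}, labels={'SurvivalLabel': 'Survival'})
fig.show()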
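
A pie chart built with only `names` draws one slice per distinct value, sized by how many rows carry that value. The sketch below reproduces that counting step by hand on a made-up list of 0/1 survival codes. The `sample_survived` values and the label names are assumptions for illustration, not figures from the Titanic dataset.

```python
from collections import Counter

# Hypothetical survival codes for illustration only (0 = died, 1 = survived)
sample_survived = [0, 1, 1, 1, 0, 0, 0, 1, 0, 0]
label_for_code = {0: 'Not Survived', 1: 'Survived'}

# Count rows per label, then convert the counts to shares of the whole
counts = Counter(label_for_code[code] for code in sample_survived)
total = sum(counts.values())
shares = {label: count / total for label, count in counts.items()}

for label, share in shares.items():
    print(f"{label}: {share:.0%}")
```

Each share corresponds to the fraction of the circle that one slice occupies.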
In [13]:
# Fare against age; marker color encodes survival and marker size encodes fare
fig = px.scatter(titanic_data, x='fare', y='age', color='survived', size='fare')
fig.show()

Converting the notebook to HTML

In [14]:
!jupyter nbconvert --to html "C:\Users\arcad\CSCN8010-labs\Lab2-Arca\Class 2_Lab_Arcadio_v3.ipynb" --output-dir ./docs/
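
The command above hard-codes an absolute path on one machine. As a hedged sketch, the same `jupyter nbconvert` invocation can be assembled from a relative path in Python. The notebook name `lab2.ipynb` and the helper `build_nbconvert_command` are hypothetical, and only the `--to html` and `--output-dir` options already used above are assumed.

```python
from pathlib import Path

def build_nbconvert_command(notebook, output_dir="docs"):
    """Return the argument list for converting a notebook to HTML (hypothetical helper)."""
    return [
        "jupyter", "nbconvert",
        "--to", "html",
        str(Path(notebook)),
        "--output-dir", str(Path(output_dir)),
    ]

# 'lab2.ipynb' is a placeholder name, not the author's actual file
command = build_nbconvert_command("lab2.ipynb")
print(" ".join(command))
```

The resulting list can be passed to `subprocess.run(command, check=True)` in an environment where Jupyter is installed. Passing the arguments as a list avoids quoting problems with paths that contain spaces.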